Protein expression and structure solution using specific fusion vectors

ABSTRACT

A recombinant protein comprised of an amino acid sequence of a motor protein, a target protein of interest, and optionally, a linker sequence between the two proteins, is disclosed. Also disclosed are a DNA sequence encoding such a recombinant protein, a vector expressing such a recombinant protein, a host cell transformed with such a vector, a method for producing such a recombinant protein, and methods for purification, crystallization and structure elucidation of such a recombinant protein. In preferred embodiments of the invention, the motor protein is a member of the kinesin or myosin superfamilies, or an analog, fragment or derivative thereof.

CLAIM FOR FOREIGN PRIORITY

[0001] This application claims priority from European Patent Application 01100762.2, filed Jan. 12, 2001. The entire contents of the prior application are incorporated herein by reference.

REFERENCE TO SEQUENCE LISTING

[0002] This application includes a “Sequence Listing” provided in computer readable form, “PatentIn Ver. 2.1,” the entire contents of which are incorporated herein by reference. A paper copy of the “Sequence Listing” is also provided.

BACKGROUND OF THE INVENTION

[0003] The present invention relates to a recombinant protein comprised of an amino acid sequence of a motor protein, a target protein of interest, and optionally, a linker sequence between the two proteins. The invention also relates to a DNA sequence encoding such a recombinant protein, a vector expressing such a recombinant protein, a host cell transformed with such a vector, a method for producing such a recombinant protein, and methods for purification, crystallization and structure elucidation of such a recombinant protein.

[0004] The first step, and perhaps the single most important step, in the crystallization of a macromolecule, e.g., a protein, is its purification. Any impurities in the protein solution to be used for crystallization may impair crystal quality or, even worse, preclude the formation of crystals altogether.

[0005] Procedures for accomplishing the highest degree of purification possible have been under development for more than 200 years, and recent times have seen an explosion in the invention of new methods and refinement of old ones. A variety of methods exemplify how problems of protein purification for protein analysis or protein crystallization have been approached.

[0006] One such method is fractionation with salts and other precipitants. Here, proteins are precipitated from a complex mixture (e.g., a physiological fluid) by addition of various concentrations of different salts. Because individual proteins precipitate at different salt concentrations, this “salting out” phenomenon provided a method for selectively precipitating, and thereby purifying, unique proteins from a mixture (Morris and Morris, “Separation methods in biochemistry,” Pitman, London, GB, 1964). A minor disadvantage of salt fractionation is that protein preparations, be they supernatants or precipitates, are left with high residuals of salt. This may seriously interfere with the evaluation of activity and purity and with subsequent purification procedures. The most common method for removing the residual salt is dialysis in celluloid or collodion tubes.

[0007] Apart from varying the concentration of a salt, proteins may be selectively precipitated and fractionated by the addition of a variety of organic solvents (Cohn et al., “Crystallization of serum albumin from ethanol/water mixtures”, J. Am. Chem. Soc., 69:1753, 1947). This is generally carried out at sub-zero temperatures ranging down to −30° C. to enhance the precipitation effect and to minimize the denaturation of the protein. In addition to salt and organic solvents, other materials have been used to precipitate and fractionate a mixture of proteins. Some of these materials are, for example, protamine (a mixture of small basic proteins) and polyethyleneimine (a basic organic polymer), which apparently cross-link some proteins via electrostatic bridges. Moreover, metal ions or organic polymers, such as polyethylene glycol (PEG), have been extensively used for purification purposes. PEG seems to act as a hybrid between an alcohol and a salt, and its precise properties may vary as a function of mean polymer length.

[0008] Still another method of protein purification is the selection of proteins with heat or pH. pH is effective because most proteins exhibit pH-dependent solubility minima and precipitate or even crystallize from solution at particular values, whereas the property of protein heat stability may sometimes provide a valuable purification step.

[0009] Other protein purification methods based on physical techniques are also well-known to a person skilled in the art. Centrifugation, for example, has to be mentioned, whereby a solution containing multiple components varying in weight, size, and density is deployed in a tube and rotated at high angular velocity. An almost preparative centrifugation is conducted on a gradient of varying density from the top to the bottom of the centrifuge tube. Two common techniques utilized in connection with density gradient separation are sedimentation velocity and sedimentation equilibrium or isopycnic centrifugation. Furthermore, electrophoretic separation methods (Svensson, “Preparative electrophoresis and ionophoresis,” Adv. Protein Chem., 4:251, 1947) are routinely used and are based on the application of an electrical field across an insoluble, porous support medium permeated by a buffer solution. Depending on the net charge of the proteins to be separated, they will experience an electromotive force and migrate toward one electrode (cathode or anode). For the separation of proteins, polyacrylamide gels as support medium have been shown to have almost ideal properties.

[0010] Finally, chromatographic methods are especially well suited to separate proteins and to purify the target protein for later crystallization steps. Classic ion exchange chromatography is simply conducted by packing a vertical hollow glass column with an insoluble resin or colloidal matrix that exhibits an array of positively charged (anion exchange chromatography) or negatively charged chemical groups (cation exchange chromatography). Ion exchange chromatography is based on the fact that a positively charged protein will be retarded or bound by electrostatic interactions with a matrix carrying negatively charged groups or vice-versa for negatively charged proteins. Depending on their respective net charge, the proteins to be separated will appear in the eluent sequentially with time (or volume). Molecules tightly bound to the matrix may be eluted from the column by competition with other charged ions. In contrast to ion exchange chromatography, molecular sieve chromatography (also called gel permeation chromatography) separates molecules on the basis of molecular weight and shape. Here, macromolecules, like proteins, are induced to flow by gravity or pressure through a column containing a matrix of microscopic beads perforated with a vast network of channels. The molecular sieving effect influences the speed of passage from the top to the bottom of the column, leading to the inverse effect that larger molecules will appear first in the column eluent. Finally, adsorption chromatography, HPLC (high performance liquid chromatography), and affinity chromatography are also well established as biochemical purification methods.

[0011] All of the above-mentioned methods exhibit certain advantages and disadvantages. Consequently, the person skilled in the art will choose the purification method which appears to be most appropriate for a given system.

[0012] In recent years many purification methods have begun to take advantage of recombinant proteins (Kane and Hartley, “Development of expression systems for production of high levels of protein,” Trends Biotechnol., 6:95, 1988). Recombinant proteins are produced by recombinant DNA techniques in bacteria, yeast or other organisms such as virus infected mammalian or insect cells. The advantage of recombinant proteins is based on genetically designed elements that aid the biochemist in applying one of the aforementioned physical or biochemical purification methods. For example, a series of histidine residues, or a “His-tag”, may be appended to the carboxyl terminus of a recombinant protein. Such a histidine appendix makes it easier to isolate the expressed protein on a copper or nickel containing chromatographic resin, the latter being available commercially in prepacked columns.

[0013] A second procedure in wide use for the purification of recombinant proteins is the fusion of an expressed protein with the enzyme glutathione S-transferase (GST). This enzyme has a very high affinity for the small peptide glutathione. Following expression of the protein, an extract of the cells is passed over a small chromatography column containing a matrix conjugated with glutathione. The chimeric protein is then reversibly bound on the column through the GST, contaminants are washed from the column, and finally the recombinant protein is eluted with free glutathione and collected. The GST may then be cleaved from the chimera by a specific protease to produce the free recombinant protein. Again, the chromatographic matrix may be obtained commercially in prepacked columns.

[0014] Furthermore, the pMAL™ system (New England Biolabs Inc.), for example, is used in the prior art as a protein fusion and purification system. This system comprises the insertion of the cloned gene into a pMAL™ vector downstream from the malE gene, which encodes maltose binding protein (MBP). The fusion protein (target protein and MBP) is expressed in large quantities and purified by affinity chromatography for MBP using amylose resin. Finally, MBP is cleaved from the target protein by a specific protease.

[0015] These techniques utilizing recombinant proteins allow one to obtain extraordinarily pure fractions of the target protein. However, advantageous conditions for purification, crystallization and structural analysis have to be tested using the MBP/target protein or GST/target protein fusion systems for each single recombinant and/or target protein. In particular, complex chromatographic (in vitro) purification steps are still required for obtaining pure fractions of the target protein, and further steps of analysis, like crystallization or structure determination, are complicated by the unknown properties of the target protein, such as, for example, the crystallization conditions of a specific target protein purified as an MBP or GST fusion protein.

SUMMARY OF THE INVENTION

[0016] The object of the present invention is to overcome the above-mentioned disadvantages of the prior art, and particularly to provide a system that considerably reduces the time and effort required for purification and subsequent crystallization as well as structure determination of any protein to be analyzed.

[0017] The principles of the present invention provide a recombinant protein comprised of an amino acid sequence of a motor protein, a target protein of interest, and optionally, a linker sequence between the two proteins. The invention also relates to a DNA sequence encoding such a recombinant protein, a vector expressing such a recombinant protein, a host cell transformed with such a vector, a method for producing such a recombinant protein, and methods for purification, crystallization and structure elucidation of such a recombinant protein.

[0018] In one aspect, the present invention provides recombinant proteins comprising:

[0019] (1) an amino acid sequence of a member of the myosin or kinesin protein superfamilies or an amino acid sequence of an analog, fragment or derivative of a member of the myosin or kinesin protein superfamilies;

[0020] (2) any amino acid sequence of at least 20 amino acids in length (target protein sequence); and optionally,

[0021] (3) a linker region of at least 2 amino acids between components (1) and (2).

[0022] It is within the scope of this invention that components (1) and (2) are directly fused together without insertion of linker sequence (3).
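
The three-component layout described above can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions: sequences are plain one-letter-code strings, component (1) is placed N-terminally, and the function name `assemble_fusion` is hypothetical and not part of the disclosure. It enforces the stated length limits (target of at least 20 residues; linker either absent or at least 2 residues).

```python
def assemble_fusion(motor: str, target: str, linker: str = "") -> str:
    """Concatenate component (1), optional linker (3) and component (2).

    Hypothetical helper: sequences are one-letter-code strings, written
    from N-terminus to C-terminus.
    """
    if len(target) < 20:
        raise ValueError("target protein sequence must be at least 20 amino acids")
    if linker and len(linker) < 2:
        raise ValueError("a linker, if present, must be at least 2 amino acids")
    # An empty linker models the direct fusion of components (1) and (2).
    return motor + linker + target


# Placeholder sequences only; these are not real motor or target sequences.
fusion = assemble_fusion("M" * 30, "A" * 20, linker="LGS")
direct = assemble_fusion("M" * 30, "A" * 20)
```

The two calls correspond to a fusion with a Leu-Gly-Ser linker and to a direct fusion without a linker.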

[0023] In accordance with the principles of the invention, component (1) may comprise any protein or fragment, derivative or analog thereof, which binds to any molecule or structure of the cytoskeleton or a cell membrane in a ligand dependent manner. Particularly preferred are molecules which exhibit a flexible region, particularly at the molecule's C-terminal region, in order to sample multiple conformations.

[0024] Component (1) may also comprise an amino acid sequence of an analog, fragment or derivative of a member of the myosin or kinesin protein superfamilies. Such analogs, fragments and derivatives are prepared by a standard procedure (Sambrook et al., “Molecular Cloning: A Laboratory Manual,” Cold Spring Harbor, N.Y., 1989) in which one or more codons in the DNA sequences encoding the inventive recombinant protein may be deleted, added or substituted by another, to yield analogs having at least one amino acid residue change with respect to the native recombinant protein, particularly with respect to the native amino acid sequence of component (1) or (2) of the recombinant protein of the invention.

[0025] Analogs that substantially correspond to the native sequence of one or more components of the inventive recombinant protein are those polypeptides in which one or more amino acids of the native protein's amino acid sequence has/have been replaced by another amino acid, deleted and/or inserted.

[0026] In a preferred embodiment of the present invention, the resulting components ((1) or (2)) being incorporated into the recombinant protein of the invention exhibit substantially the same or even higher biological activity as the corresponding native protein or exhibit at least structurally similar properties to the native protein to which the component corresponds. In order to substantially correspond to the native sequence of component (1) or (2) of the recombinant protein of the invention, the changes in the sequence of the components are generally and preferably relatively minor, such as isoforms. Although the number of changes may be more than 10, preferably there are no more than 10 changes, more preferably no more than 5 and most preferably no more than 3 changes in component (1) or (2) as compared to the respective native sequence.
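
The thresholds of 10, 5 and 3 changes can be illustrated with a short Python sketch. It makes simplifying assumptions that go beyond the text: only substitutions between equal-length sequences are counted (insertions and deletions would require a sequence alignment), and the function names and tier labels are hypothetical.

```python
def count_substitutions(native: str, analog: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    if len(native) != len(analog):
        raise ValueError("this sketch compares equal-length sequences only")
    return sum(a != b for a, b in zip(native, analog))


def preference_tier(n_changes: int) -> str:
    """Map a number of changes onto the preference levels stated above."""
    if n_changes <= 3:
        return "most preferred"
    if n_changes <= 5:
        return "more preferred"
    if n_changes <= 10:
        return "preferred"
    return "within scope"  # more than 10 changes is still permitted


# Placeholder peptides differing at two positions (T->S and I->L).
n = count_substitutions("MKTAYIAK", "MKSAYLAK")
tier = preference_tier(n)
```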

[0027] While any technique may be used to find potentially biologically active sequences of a component of the inventive recombinant protein, which substantially correspond to the respective native proteins, one such technique is the use of conventional mutagenesis techniques on the DNA encoding the protein, resulting in a few modifications. The sequences used for component (1) or (2) in the recombinant protein of the invention which are expressed by such clones may then be screened for their ability, e.g., to bind to their native binding partners, mediate activity, etc., in other words to fulfil their biological role.

[0028] Conservative “changes” are those changes which would not be expected to change the activity of the protein and are usually the first to be screened as these would not be expected to substantially change size, charge or structure of the polypeptide sequence used as component in the recombinant protein of the invention and thus would not be expected to change the biological properties of the corresponding native sequence. For example, conservative substitutions are assumed, if: (a) small aliphatic, non-polar or slightly polar residues are substituted by other residues belonging to the same group; (b) polar negatively charged residues and their amides are exchanged for other residues belonging to the same group; (c) polar positively charged residues are exchanged for polar positively charged residues; (d) large aliphatic non-polar residues are exchanged for large aliphatic non-polar residues; or (e) finally, aromatic residues are substituted by other aromatic residues.
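
Groups (a) to (e) can be turned into a simple lookup, sketched below in Python. The text does not list which residues belong to each group; the assignment used here (Ala/Ser/Thr/Pro/Gly, Asp/Asn/Glu/Gln, His/Arg/Lys, Met/Leu/Ile/Val/Cys, Phe/Tyr/Trp) is a commonly used grouping and is an assumption, as are all names in the code.

```python
# Assumed residue membership for groups (a)-(e); not specified in the text.
CONSERVATIVE_GROUPS = {
    "a_small_aliphatic": set("ASTPG"),
    "b_acidic_and_amides": set("DNEQ"),
    "c_basic": set("HRK"),
    "d_large_aliphatic": set("MLIVC"),
    "e_aromatic": set("FYW"),
}


def is_conservative(native_residue: str, new_residue: str) -> bool:
    """Return True if both residues fall within the same assumed group."""
    return any(
        native_residue in group and new_residue in group
        for group in CONSERVATIVE_GROUPS.values()
    )


leu_to_ile = is_conservative("L", "I")  # both in group (d)
lys_to_glu = is_conservative("K", "E")  # basic vs. acidic: different groups
```

A screen of candidate analogs could apply such a check position by position and examine the conservative substitutions first.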

[0029] In most cases, in the context of the present invention, analogs being used as component (1) or (2) of the recombinant protein of the invention are defined as sequences with substitutions which do not produce radical changes in the characteristics of the corresponding native protein or polypeptide molecule. Characteristics may be the specific secondary structure of a sequence, e.g., α-helix or β-sheet, as well as its specific biological activity.

[0030] It is noted that apart from sequences being used as component (1) or (2) for a recombinant protein according to the present invention, which are based on conservative substitutions as discussed above, analogs with more random changes, which lead to a radical or more radical change in biological activity or structure of the analog as compared to the native sequence are also within the scope of the present invention.

[0031] At the genetic level, these analogs are generally prepared by site-directed mutagenesis of nucleotides in the DNA encoding the inventive recombinant protein or the component of the recombinant protein, respectively, thereby producing DNA encoding the analog and thereafter synthesizing the DNA and expressing the polypeptide in recombinant cell culture. Reference is made to Ausubel et al., “Current Protocols in Molecular Biology,” Greene Publications and Wiley Interscience, New York, N.Y., 1987-1995; and Sambrook et al., “Molecular Cloning: A Laboratory Manual,” Cold Spring Harbor Laboratory, New York, 1989, the entire disclosures of which are incorporated herein by reference.

[0032] Furthermore, site-specific mutagenesis allows the production of analogs through the use of specific oligonucleotide sequences that encode the DNA sequence of the desired mutation. The technique of site-directed mutagenesis is exemplified by publications such as Adelman et al., DNA, 2:183 (1983), the entire disclosure of which is incorporated herein by reference. Typical vectors useful in site-directed mutagenesis include vectors such as M13-phage, for example as disclosed by Messing et al., “3rd Cleveland Symposium on Macromolecules and recombinant DNA,” editor A. Walton, Elsevier, Amsterdam (1981), the entire disclosure of which is incorporated herein by reference.

[0033] As far as derivatives of the native sequence of components of the recombinant protein of the present invention are concerned, derivatives may be prepared by standard modifications of the side groups of one or more amino acid residues of the recombinant protein of the invention, its analogs or fragments or by conjugation of the native sequence used as component (1) or (2) of the inventive recombinant protein, its analogs or fragments, to another molecule, e.g., an antibody, enzyme, receptor, etc.

[0034] Accordingly, “derivatives” as used herein cover derivatives that may be prepared from the functional groups occurring as side chains on the residues or from the N- or C-terminal groups by means known in the art. Derivatives may have chemical moieties such as carbohydrates or phosphate residues. For example, derivatives may include aliphatic esters of the carboxyl groups, amides of the carboxyl group by reaction with ammonia or with primary or secondary amines, N-acyl derivatives of free amino groups of the amino acid residues formed with acyl moieties or O-acyl derivatives of free hydroxyl groups (for example of seryl or threonyl residues) formed with acyl moieties. The term derivative will also include all polypeptide sequences for a particular component ((1) and/or (2)) of the recombinant protein sequence which are larger in sequence than the corresponding native sequence. The addition of at least one, typically more than 10 amino acids may take place intrasequentially or at the N- or C-terminus of the sequence of component (1) and/or (2) of an inventive recombinant protein. In a preferred embodiment of the present invention, additional amino acids are appended to the N-terminus of component (1) or the C-terminus of component (2) coinciding with the N-terminus and the C-terminus of the inventive recombinant protein.

[0035] In another preferred embodiment, additional amino acid sequences are inserted intrasequentially, preferably in such a way that the secondary and/or tertiary structure is not destroyed. Typically these insertions are placed at the surface of the protein, e.g., in β-bends. Preferably, one or more S-containing residues (particularly Cys) are inserted or other residues with a potential for binding heavy metal atoms (e.g., Hg-ions). The introduction of additional heavy metal binding residues at sites on the surface of the recombinant protein of the invention may be by substitution and/or deletion of native binding residues in order to create novel heavy metal atom binding sites. Such a procedure is particularly suitable for gaining additional phasing information for structure determination of large protein complexes by X-ray crystallography.

[0036] In a non-limiting manner, “tag”-sequences may be contained in the recombinant protein and, particularly, may be added to the N- or C-terminus of the recombinant protein of the invention. These “tag”-sequences typically have antigenic character for commercially available antibodies, e.g., an N-terminal “Flag-tag” having the sequence DYKDDDDK (one-letter-code). Other suitable “tag”-sequences are, for example, N- or C-terminal polyhistidine tags.

[0037] Furthermore, component (1) and/or (2) as parts of the recombinant protein of the invention may be fusion proteins. Particularly preferred are sequences fused to the N-terminus of the native sequence of component (1) or to the N-terminus of an analog, derivative or fragment thereof. For example, component (1) of the recombinant protein may be fused N-terminally to a marker protein, e.g., an enzyme marker or a fluorescence marker, such as GFP (green fluorescent protein), or any sequence being suitable as epitope for an antibody or even to an antibody or an antibody fragment itself.

[0038] Finally, “fragments” of the native sequence of any protein being used as component (1) or (2) of the recombinant protein according to the present invention may be used, e.g., fragments of proteins of the myosin or kinesin protein superfamilies, particularly fragments being deleted C-terminally, the deletion comprising at least ten, and more preferably at least 50 amino acids. However, the fragment of the native sequence may also contain deletions at the N- and/or the C-terminus and/or intrasequentially in component (1) and/or component (2) of a recombinant protein of the invention.

[0039] In a preferred embodiment, component (1) consists of a fragment comprising the catalytic domain of a member of the myosin or kinesin protein superfamilies of any eukaryotic organism. In other words, component (1) corresponds preferably to a fragment containing the myosin or kinesin motor domain. Within the scope of the present invention are therefore recombinant proteins characterized in that they contain as component (1) an amino acid sequence for the motor domain of a kinesin or myosin family member or an analog, fragment or derivative thereof.

[0040] In a preferred embodiment of the present invention, the recombinant protein according to the present invention contains as component (1) an amino acid sequence of a member of the myosin I, II, III, IV, V, VI, VIII, X, or XI or a member of kinesin I or II families or an amino acid sequence of an analog, fragment or derivative of a member of the aforementioned myosin and kinesin families. Preferably, component (1) contains a member of the myosin II family of any eukaryotic organism or an analog, fragment or derivative thereof.

[0041] Still further preferred, component (1) contains myosin II of Dictyostelium or an analog, fragment or derivative thereof. Further preferred embodiments of the present invention for component (1) are proteins containing the motor domains of smooth muscle myosin II (e.g., chicken gizzard myosin), vertebrate or amoeboid forms of myosin I (bovine brush border myosin), Dictyostelium myoID, vertebrate myosin V, myosin VI, Toxoplasma gondii (e.g., TgMyoA) and Plasmodium sp. myosin XIV, vertebrate kinesin (human kinesin I), amoeboid or fungal kinesins (e.g., Dictyostelium kinesin 7).

[0042] Preferably, a recombinant protein according to the present invention contains as linker component (3) a stretch of at least 3 amino acids, more preferably 5 amino acids, and still further preferably, 10 amino acids. Particularly preferred is a linker sequence which contains a protease cleavage site. A recognition sequence for any protease may be used; for example, the cleavage site may contain the recognition sequence for factor Xa, thrombin or for the protease TEV (recognition sequence: ENLYFQG) or the Soldati protease. However, as discussed previously, linker component (3) is optional, and it is within the scope of this invention that components (1) and (2) are directly fused together without insertion of a linker sequence.
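
How a protease site in the linker separates the two components can be sketched in Python as follows. The recognition sequence ENLYFQG is taken from the text; the cut position between Gln and Gly reflects the generally known behaviour of TEV protease and is not stated in this document. The function name and the placeholder sequences are hypothetical.

```python
TEV_SITE = "ENLYFQG"  # recognition sequence given in the text


def cleave_at_tev_site(fusion: str) -> list[str]:
    """Split a one-letter-code sequence at the first TEV site, if any.

    Assumption: cleavage occurs between the Q and the G of ENLYFQG, so
    the G remains at the N-terminus of the downstream fragment.
    """
    start = fusion.find(TEV_SITE)
    if start == -1:
        return [fusion]  # no site: the fusion stays intact
    cut = start + len(TEV_SITE) - 1  # index of the G residue
    return [fusion[:cut], fusion[cut:]]


# Placeholder motor ("MMMM") and target ("AAAA") segments around the linker.
fragments = cleave_at_tev_site("MMMM" + TEV_SITE + "AAAA")
```

In this model the upstream fragment keeps the motor component plus ENLYFQ, and the downstream fragment is the target protein preceded by a single Gly.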

[0043] If linker component (3) consists of three amino acids, it is preferred to choose a sequence with at least one Gly residue, particularly in the second position of the linker stretch. More preferred, however, is a linker with the sequence: N-Leu-Gly-Arg-C or N-Leu-Gly-Ser-C.

[0044] As component (2) (the target protein), preferred recombinant proteins of the present invention may contain the sequence of an esterase, hydrolase, phosphatase, kinase, protease, channel, structural protein (e.g., coronin, spectrin), receptor, particularly a neuronal or immunologically relevant receptor (e.g., superfamily of TNF receptors), transcription factor, DNA/RNA-binding protein, lipoprotein, glycoprotein or an analog, derivative or fragment thereof.

[0045] A recombinant protein according to the present invention may have as component (1) an amino acid sequence as exhibited in FIG. 6 (SEQ ID NO. 1) or an analog, derivative and/or fragment thereof. It is preferred to combine the sequence of FIG. 6 with a linker sequence (3) containing a protease recognition site as exemplified above or the amino acid sequence Leu-Gly-Ser. Still further preferred is a recombinant protein having a sequence as shown in FIG. 7 (SEQ ID NO. 2).

[0046] A second aspect of the present invention relates to a DNA sequence which contains a sequence which codes for an amino acid sequence of a recombinant protein according to the present invention. In particular, the present invention provides a DNA sequence selected from the group consisting of:

[0047] (a) a cDNA sequence derived from the coding region of a recombinant protein according to the present invention;

[0048] (b) DNA sequences capable of hybridization to a sequence of (a) under moderately stringent conditions; and

[0049] (c) DNA sequences which are degenerate as a result of the genetic code to the DNA sequences defined in (a) and (b), above.
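
Item (c) relies on the degeneracy of the genetic code: different DNA sequences can encode the same amino acid sequence. The Python sketch below illustrates this with two hypothetical coding sequences for the linker Leu-Gly-Ser. It assumes the standard genetic code; the two nucleotide sequences are illustrative only and are not taken from the Sequence Listing.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding sequence whose length is a multiple of three."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Two different (hypothetical) DNA sequences for the linker Leu-Gly-Ser.
variant_1 = "CTGGGTAGC"
variant_2 = "TTAGGATCC"
same_protein = translate(variant_1) == translate(variant_2)
```

Both variants translate to LGS although they differ at six of nine nucleotide positions.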

[0050] Another specific embodiment of the above DNA sequence of the invention is a DNA sequence comprising at least part of a sequence encoding a recombinant protein as depicted in FIG. 8 (SEQ ID NO. 3), particularly the segment of FIG. 8 which codes for the myosin motor domain. Nucleic acid stretches encoding a recombinant protein of the present invention may be detected, obtained and/or modified, in vitro, in situ and/or in vivo, by the use of known DNA or RNA amplification techniques, such as polymerase chain reaction (PCR) and chemical oligonucleotide synthesis.

[0051] PCR allows for the amplification (increase in number) of a specific DNA sequence by repeated DNA polymerase reactions. This reaction may be used as a replacement for cloning. All that is required is a knowledge of the nucleic acid sequence. In order to carry out PCR, primers are designed which are complementary to the sequence of interest. The primers are then generated by automated DNA synthesis. Because primers may be defined to hybridize to any part of the gene, conditions may be created such that mismatches in the complementary base pairing may be tolerated. Amplification of these mismatch regions may lead to the synthesis of a mutagenized product resulting in the generation of a polypeptide with new properties (site-directed mutagenesis).

[0052] By coupling complementary DNA (cDNA) synthesis, using reverse transcriptase, with PCR, RNA may be used as the starting material for the synthesis of a recombinant protein of the invention. Furthermore, PCR primers may be designed to incorporate new restriction sites or other features such as termination codons at the end of the segment to be amplified. This placement of restriction sites at the 5′ and 3′ ends of the amplified nucleic acid sequence allows for a gene sequence encoding a recombinant protein of the invention or a fragment thereof to be custom designed for ligation with other sequences and/or cloning sites in vectors.
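
Adding restriction sites to PCR primers can be sketched in Python as shown below. Everything specific in this example is an assumption made for illustration: the choice of BamHI (GGATCC) and XhoI (CTCGAG) sites, the annealing length of 8 nucleotides, the template sequence and the function names. Practical primer design also considers melting temperature, extra flanking bases and reading frame, which are omitted here.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def design_primers(template: str, site_5p: str, site_3p: str, anneal: int = 8):
    """Build forward and reverse primers carrying 5' restriction sites.

    The forward primer anneals to the start of the template; the reverse
    primer is the reverse complement of the template's end.
    """
    forward = site_5p + template[:anneal]
    reverse = site_3p + reverse_complement(template[-anneal:])
    return forward, reverse


# Hypothetical 24-nt coding segment ending in a TAA termination codon.
template = "ATGGCCAAGGTTCTGACCGATTAA"
fwd, rev = design_primers(template, site_5p="GGATCC", site_3p="CTCGAG")
```

The amplified product would then carry one site at each end, so that the segment can be cut and ligated into a vector with matching cloning sites.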

[0053] PCR and other methods of amplification of RNA and/or DNA are well known in the art and may be used according to the present invention without undue experimentation. Known methods of DNA and RNA amplification include PCR and related amplification processes (Innis et al., PCR Protocols: A Guide to Methods and Applications) and RNA mediated amplification which uses antisense RNA to the target sequence as a template for double stranded DNA synthesis (see, e.g., U.S. Pat. No. 5,130,238, the entirety of which is incorporated herein by reference).

[0054] In an analogous fashion, a recombinant protein of the invention being composed of components (1), (2) and (3) as defined above may be prepared, whereby components (1), (2) and (3) are ligated on a genetic level forming a DNA sequence of the invention, which is used to express a recombinant protein of the invention in a suitable host system.

[0055] Also provided by the present invention are vectors encoding the above recombinant protein, and analogs, fragments or derivatives of the invention, which contain the above DNA sequence of the invention. Such vectors are capable of being expressed in suitable eukaryotic or prokaryotic host cells. Particularly preferred are vectors of the invention which are capable of being expressed in cells of the species Dictyostelium.

[0056] In an expression vector of the present invention, the DNA sequence is operably linked to a promoter, preferably linked upstream. The promoter will preferably be a eukaryotic promoter, particularly a constitutive promoter. Promoters for the transcription of a DNA sequence according to the invention in cells of higher eukaryotes may be derived from viral genomes. Examples would be polyoma viruses, retroviruses, adenoviruses, cytomegaloviruses, SV40 and the like. With mammalian cells, a possibility would be the β-actin promoter. In the current invention, the actin15 promoter is particularly preferred for expression in Dictyostelium.

[0057] If appropriate, other regulating elements of transcription and/or translation will be provided. Particularly preferred are cis-acting elements, such as enhancer sequences, which usually include 10 to 300 base pairs and act upon the promoter to raise the transcription rate. These may be arranged in the 3′ or 5′ position of the DNA sequence according to the invention, in the coding sequence itself, or in an intron sequence which is cut out by splice procedures. Further regulating elements may serve to regulate transcription termination, so that the expression of mRNA is involved.

[0058] If necessary, the expression vectors with the DNA of the invention are developed as shuttle vectors, that is, they are able to replicate in one host system and can then be transfected into another host system for purposes of expression. For instance, a vector might first be cloned in E. coli and then be introduced into Dictyostelium, yeast or any mammalian cell for expression.

[0059] Typically, such expression and cloning vectors include at least one selection gene exercising a marker function. A selection gene allows host cells to survive or grow after being transformed by the vector. Typical selection genes code for proteins that confer resistance toward antibiotics or other toxins. These include, for instance, puromycin, ampicillin or neomycin.

[0060] The principles of the present invention also provide host cells, and particularly eukaryotic host cells, transformed with an expression vector according to the invention. Appropriate host cells for cloning or expressing the DNA sequences are prokaryotic cells, yeast or higher eukaryotic cells. In a preferred embodiment, cells for expressing DNA sequences according to the invention are selected from multicellular organisms. This choice reflects the function of component (1) of the recombinant protein of the invention, which binds to elements of the cytoskeleton (e.g., actin or microtubules) or to components of the cell membrane or the membrane of any intracellular organelle, e.g., mitochondria. In principle any eukaryotic cell may be used as host cell, although cells of mammals such as monkeys, mice, rats, hamsters or humans are preferred. Particularly preferred are cells from the species Dictyostelium.

[0061] The present invention relates in a further aspect to a method for producing a recombinant protein according to the invention, the method comprising the following steps:

[0062] (a) preparing a vector according to the invention;

[0063] (b) transforming eukaryotic host cells with a vector obtainable from step (a); and

[0064] (c) growing transformed host cells of the invention and obtainable from step (b) under conditions suitable for the expression of the recombinant protein.

[0065] The expression method of the invention allows for overexpression of any target protein or polypeptide of at least 20 amino acids in length (component (2)), as a segment of the recombinant protein of the invention. Accordingly, large amounts of target protein as part of a recombinant protein of the invention are produced by the method of the invention. It is preferred within the scope of the present invention to concentrate the overexpressed recombinant protein in the cell. This is achieved by constructing recombinant proteins of the invention which do not carry any leader sequences for secretion out of the transformed host cell.

[0066] Another aspect of the present invention is a method for purifying a recombinant protein of the invention or any other recombinant protein containing an amino acid sequence binding to cytoskeleton (actin or microtubules or proteins being bound to actin in the cell) or membrane (e.g., inner cell membrane or outer or inner membrane of a cell organelle) structures and another amino acid sequence (the target sequence to be analyzed), the method comprising:

[0067] (a) preparing a vector according to the invention or a vector encoding any recombinant protein (as disclosed above);

[0068] (b) transforming eukaryotic host cells with a vector obtainable from step (a);

[0069] (c) growing transformed host cells according to the invention and/or obtainable from step (b) under conditions suitable for the overexpression of the recombinant protein;

[0070] (d) purifying overexpressed recombinant protein by binding to endogenous elements or structures of the cytoskeleton or membrane, such as actin or microtubules, of the eukaryotic host cell; and

[0071] (e) releasing bound recombinant protein from these structures or elements, preferably actin or microtubules.

[0072] In a preferred embodiment of this method, step (e), the releasing step, involves a separation from the structures or elements of the cell by adding a substrate, be it a natural or non-natural substrate, of component (1) of the recombinant protein. Whereas in general the natural substrate will be used, it may be preferable in certain cases to use a non-natural substrate of component (1), such as, for example, GTP or (nucleotide) analogues (where ATP is the natural substrate), for releasing purposes. In general, any substrate with the potential to release the bound recombinant protein from the cell structure or element, particularly by binding to component (1) of the recombinant protein, is suitable to be used for step (e). It will be appreciated that a method of the invention using a member of the kinesin or myosin superfamily or a derivative, fragment or analog thereof as component (1) is particularly preferred if it is characterized by the addition of ATP, which is the natural substrate for these proteins with motility function.

[0073] In yet a further preferred embodiment, the purification method of the invention comprises an additional step (f). Step (f) may typically provide at least one additional in vitro purification step, whereby all common purification procedures available may be used, for instance all procedures described by A. McPherson, “Crystallization of Biological Macromolecules,” Cold Spring Harbor Laboratory Press, NY, 1999, the entire contents of which are incorporated herein by reference.

[0074] In a non-limiting manner, the following methods, particularly biochemical and/or physical methods, may be used or combined: salt fractionation, desalting, fractionation with organic solvents or with other precipitants, selection with heat/pH, centrifugation, chromatographic methods, e.g., ion exchange chromatography, molecular sieve chromatography, adsorption chromatography, affinity chromatography or HPLC, ultrafiltration, isoelectric focusing and/or electrophoresis.

[0075] Affinity chromatography is particularly preferred, whereby metals (e.g., Ni) and/or antibodies are typically bound to a resin as ligands. The affinity chromatography may typically be carried out in batch mode or on a column packed with an insoluble support matrix.

[0076] A further aspect of the present invention is a recombinant protein, particularly in isolated and/or purified form, obtainable from a method for producing the recombinant protein of the invention as described herein.

[0077] A still further aspect of the present invention is a method for crystallizing a recombinant protein of the invention, wherein the method comprises (a) a purification step according to a method of the invention and (b) a crystallization step. Here, the purified recombinant protein obtained in step (a) is crystallized by any method known to the skilled person. The crystallizing step will be carried out under conditions suitable for crystal growth. The conditions may be optimized by varying certain parameters, such as stock solution, concentration of the recombinant protein, temperature, pH, ionic strength, precipitating agent (e.g., ammonium sulfate or PEG), addition of small amounts of organic solvents, etc. However, the conditions used for crystallization of component (1) alone are preferred, which means that the conditions suitable for a member of the myosin or kinesin superfamily or a fragment, analog or derivative thereof may also serve to obtain crystals of the recombinant protein of the invention.

[0078] In order to accelerate the crystallization process, it is particularly preferred to apply a recombinant protein of the invention containing as component (1) an amino acid sequence with a flexible region, particularly a flexible region at the C-terminal end of (1). Thus, a high degree of flexibility of the components is achieved, resulting in numerous conformations which can be occupied or sampled by the components in the course of the crystallization process.

[0079] It is preferred to employ vapor diffusion techniques, either by the hanging or the sitting drop method, to obtain crystals. Furthermore, crystallization may be achieved by induction of nucleation. Exemplary macro- or microseeding methods are described by A. McPherson, “Crystallization of Biological Macromolecules,” Cold Spring Harbor Laboratory Press, NY, 1999, the contents of which are incorporated by reference.

[0080] Another aspect of the present invention is a protein crystal built by a network of recombinant proteins according to the invention. This network forms the crystal lattice. Within the scope of the present invention are crystals of any space group in which identical proteins can be arranged. A crystal of the invention may contain one, two, three or more recombinant proteins per asymmetric unit. At least one heavy atom may be located at a particular position or positions in the recombinant protein being arranged symmetrically in the crystal of the invention. Crystals may contain ligands non-covalently bound to the crystallized recombinant protein as well, e.g., ATP, inhibitors, alkali ions or physiological ligands, such as hormones, carbohydrates, protein fragments, etc.

[0081] Finally, an aspect of the present invention is a method for elucidating the atomic structure of a protein crystal of the invention, whereby, after a crystallization step (a) according to the invention, X-ray diffraction data are collected in step (b) on a beamline or any kind of device suitable for measuring locations of X-ray reflections (diffractometer). In final step (c), the atomic structure, or rather the electron density map (into which the polypeptide chain and, where applicable, other ligands and water molecules are modeled), of a recombinant protein is calculated by Fourier transformation of the data set obtained in step (b) using phasing information obtained by anomalous scattering, the heavy atom method or molecular replacement techniques, as described, e.g., by Stout & Jensen, X-ray Structure Determination, Wiley, NY, 1989, which is incorporated herein by reference.
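
As a hedged illustration of the Fourier transformation named in step (c), the electron density can be written in the standard textbook form below. The symbols are conventional notation assumed here and do not appear in the original text: V is the unit cell volume, |F(hkl)| the structure factor amplitude of reflection hkl, and α(hkl) its phase.

```latex
\rho(x,y,z) \;=\; \frac{1}{V} \sum_{h}\sum_{k}\sum_{l}
  \left| F(hkl) \right| \, e^{\,i\alpha(hkl)} \, e^{-2\pi i\,(hx + ky + lz)}
```

The amplitudes follow from the reflection intensities measured in step (b). The phases are not measured directly, which is why phasing information from anomalous scattering, the heavy atom method or molecular replacement is required.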

[0082] For the present invention, molecular replacement methods are particularly useful. The phasing information may be obtained from component (1) as starting model, which is typically a structurally well determined polypeptide. Therefore, component (1) is a “helper” sequence providing the starting information to solve the structure of the recombinant protein or the structure of component (2), respectively, which is the target protein to be structurally analyzed. Further rounds of structure refinement by methods known to the skilled person or described by Stout & Jensen may serve to improve the structure model. Additionally, heavy atoms may be bound to known sites of component (1) of the recombinant protein of the invention. Thereby, additional phasing information may be obtained for structure elucidation of target component (2) (which is under analysis) of the recombinant protein of the invention.

[0083] The use of a recombinant protein of the invention for purification and crystallization purposes has unprecedented advantages over the methods known in the art. The recombinant protein via its component (1) binds to insoluble components of the cell, like the cytoskeleton, membrane components or the like. Following cell lysis, the recombinant fusion protein (or rather its component (2), which is the target protein desired to be purified, analyzed or subjected to X-ray analysis) can be enriched by ligand depletion and precipitation with the insoluble interaction partners of the cell. This allows for a purification step already carried out in the cell without any additional steps. Therefore, it is not the lysate as a whole which contains the overexpressed protein but the pre-purified precipitate itself. The specific solubilization of the fusion protein is achieved by addition of the ligand to the insoluble fraction.

[0084] For crystallization, the conditions (parameters) are preferably chosen such that they coincide with the conditions for the structurally well characterized component (1). These conditions or subtle variations of these conditions are expected to work for the recombinant protein as well. Hence, the method of the present invention for crystallizing allows one to find crystallization conditions without the extensive search for suitable parameters required by the art.

[0085] It is, however, within the scope of the present invention that a recombinant protein of the invention or any other recombinant protein which is purified according to a method of the present invention may be structurally analyzed by any other method known to the skilled person. Particularly, such recombinant proteins may be subjected to NMR analysis (two-dimensional or multidimensional) as described by Roberts, NMR of Macromolecules: A Practical Approach, Oxford-New York, 1993, which is incorporated herein by reference. Furthermore, the system of the present invention may be used for drug design (ligand to component (2) of the recombinant protein used) as described by Craik, NMR in Drug Design, CRC Press, Boca Raton, 1996, which is incorporated by reference. Other methods of structure elucidation are, for instance, mass spectrometry as described by Siuzdak, Mass Spectrometry for Biotechnology, Academic Press, San Diego, 1996, incorporated herein by reference.

[0086] Another aspect of the present invention is a method for isolating and identifying proteins that are capable of binding to the target protein sequence (component (2)) in the recombinant protein (particularly of the invention). For this purpose, a yeast two-hybrid system may be used, in which a sequence encoding the recombinant protein is carried by one hybrid vector and a second sequence is carried by the second hybrid vector, the vectors being used to transform yeast host cells and the positive transformed cells being isolated, followed by extraction of the said second hybrid vector to obtain a sequence encoding a protein which binds to said recombinant protein directly or indirectly via other proteins.

[0087] Yet another aspect of the invention provides an approach/method suitable for the identification of binding partners to the recombinant protein of the invention. The method may comprise the following steps:

[0088] (a) a library of cDNA is typically fused to the C-terminus of (1), particularly of a myosin motor domain (MMD) (typically resulting in a recombinant protein of the invention), optionally via a linker sequence;

[0089] (b) the recombinant protein is expressed in Dictyostelium or another eukaryotic system;

[0090] (c) clonal transformants are probed with the bait-protein of choice fused to any marker protein, e.g., β-galactosidase; and

[0091] (d) after washing, interacting recombinant protein is identified and determined by measuring the activity of the bait-marker fusion protein, e.g., by addition of a substrate for β-gal.

[0092] In a preferred embodiment, the myosin motor domain of step (a) may be His- or epitope-tagged at the N-terminus. Typically all steps of this method of the invention are carried out in microtiter well plates.

[0093] Preferably, the recombinant protein shown to have bound to the bait-protein of choice may be purified by the methods of the invention and can then be subjected to further biochemical or structural characterization, e.g., crystallization as described above, with or without cleavage by a protease, if a recognition site in a linker region has been provided, in order to release component (2), the target protein. This aspect of the invention is suitable for the identification of unknown binding partners and may also be used to demonstrate the interaction between two known polypeptides.

[0094] The disclosed method of isolating as yet unknown binding partners of the invention has numerous advantages over methods known in the art. MMD-fusion proteins, for example, may be easily purified from Dictyostelium and the MMD fusion system may be transferred to a wide range of higher eukaryotic cells. Further advantages include: (i) the MMD-cDNA constructs may be directly used for expression in Dictyostelium and other eukaryotic cells; (ii) decreased background (since the system works with purified proteins and not with proteins within a cellular environment that, as in the case of the yeast 2-hybrid system, leads to a high background of false positive clones); (iii) easy identification and isolation of the positive construct from mother plates; and (iv) the procedure may be highly automated, since all steps in the interaction screening may be performed in microtiter well plates.

BRIEF DESCRIPTION OF THE DRAWINGS

[0095] FIGS. 1A and 1B show the structure of M761-2R-R238E, an example of a recombinant protein of the invention. Although two molecules are present in the crystallographic asymmetric unit, only one is shown here. The two molecules are essentially identical throughout the myosin motor domain (residues 2-761) exemplifying component (1) of the recombinant protein of the invention. However, upon leaving the converter domain, the lever arms assume slightly different orientations and deviate at the ends by 19.4 Å.

[0096] FIG. 1A shows a complete molecule (recombinant protein of the invention) spanning amino acids 2-1010. No electron density was observed for five residues at the N-terminus, the loop region 205-208, and one residue at the C-terminus. The molecule comprises the N-terminal domain (2-200), 50 kDa domain (201-613), C-terminal and converter domain (614-761), linker region (762-764) (component (3) of the recombinant protein of the invention), α-actinin lever arm (765-1003) (component (2) of the recombinant protein of the invention) and seven histidines from the His purification tag (1004-1010), which are linked as specified for a preferred embodiment of the present invention. The linker region (3) is composed of three residues (Leu-Gly-Arg) introduced during cloning. The observed lever arm is ~140 Å long (measured from Cα of 761 to Cα of 1010). Each α-actinin repeat contributes ~65 Å, and the histidine purification tag another 10 Å. Helices 1-3 make up the first α-actinin repeat, and 4-6 the second. The arrowhead indicates the α-helical region linking the two repeats. The disruptive kink in helix 2 is caused by the presence of two adjacent proline residues (FIG. 5A).
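
The residue ranges listed for FIG. 1A can be checked for internal consistency with a short script. The following is a minimal sketch only: the segment labels and helper names are illustrative assumptions, while the residue numbers are taken from the paragraph above.

```python
# Residue ranges (inclusive) of M761-2R-R238E as listed for FIG. 1A.
# Labels and function names are illustrative, not part of the original text.
DOMAINS = [
    ("N-terminal domain", 2, 200),
    ("50 kDa domain", 201, 613),
    ("C-terminal and converter domain", 614, 761),
    ("linker region, component (3)", 762, 764),
    ("alpha-actinin lever arm, component (2)", 765, 1003),
    ("His purification tag", 1004, 1010),
]


def segment_length(start, end):
    """Number of residues in an inclusive range."""
    return end - start + 1


def is_contiguous(domains):
    """True if every segment starts one residue after the previous one ends."""
    return all(nxt[1] == cur[2] + 1 for cur, nxt in zip(domains, domains[1:]))


def total_residues(domains):
    """Sum of the lengths of all segments."""
    return sum(segment_length(start, end) for _, start, end in domains)


if __name__ == "__main__":
    for name, start, end in DOMAINS:
        print(f"{name}: {start}-{end} ({segment_length(start, end)} residues)")
    print("contiguous:", is_contiguous(DOMAINS))
    print("total residues:", total_residues(DOMAINS))
```

The segments tile residues 2-1010 without gaps or overlaps. The quoted lengths add up in the same way: two repeats of ~65 Å plus ~10 Å for the tag give the ~140 Å lever arm.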

[0097] FIG. 1B is a detailed view of the linker region joining the myosin converter domain to helix 1 of α-actinin. The view is rotated 180° around a vertical axis from FIG. 1A.

[0098] FIGS. 2A, 2B and 2C provide a detailed view of the conserved salt bridge linking switch I and switch II as a result of purifying and crystallizing a recombinant protein of the invention and finally solving the structure of that protein according to methods of the invention. The conserved nucleotide binding/sensing elements found in all myosins, kinesins, and G-proteins include the P-loop, switch I, and switch II.

[0099] FIG. 2A shows the structure of Dictyostelium myosin II motor complexed with Mg-ADP-BeF₃. As in Mg-ADP-VO₄ (Smith and Rayment, 1996a) and Mg-ADP-BeF₃ (Dominguez et al., 1998) structures, switch I and switch II are closed. The conserved salt bridge between residues R238 and E459 is shown as a ball-and-stick model surrounded by 2.6 Å experimental 2F_o-F_c electron density (wireframe), contoured at 1σ. As expected for a salt bridge, the electron density is continuous between the residues, which point toward each other.

[0100] FIG. 2B is the same region as observed in the crystal structure of M761-2R-R238E. The electron density was calculated from a model with alanines at positions 238 and 459 in order to eliminate model bias. Electron density for two glutamic acid residues is clearly visible, but the side chain of E238 now points away from E459 and the switch II loop has moved away from switch I.

[0101] FIG. 2C again illustrates the same region showing a superposition of the M761-2R-R238E structure with a structure of Dictyostelium myosin II motor complexed with Mg-ADP-VO₄ (PDB code 1VOM) (Smith and Rayment, 1996a). The nucleotide and R238-E459 salt bridge are shown as ball-and-stick models. Both the P-loop and switch I regions are in essentially identical conformations in both structures. However, the switch II region shifts to the right, toward the nucleotide, by ˜Å in the Mg-ADP-VO₄ structure, allowing the formation of the R238-E459 salt bridge.

[0102] FIG. 3 shows the orientation of the myosin lever arm, a segment of component (1) of an example of a recombinant protein of the invention. Shown are five molecules of actin making up part of a helical actin filament. Modeled onto this structure are myosin in the “pre-power stroke” up/closed orientation, the “post-power stroke” down/open orientation, and the M761-2R-R238E structure. First, the up, down, and actomyosin complex structures were modeled, and the M761-2R-R238E structure was then aligned to the core domain of the down/open structure via residues 160-200, which include the highly conserved P-loop region. It is noted that in the M761-2R-R238E structure, the helix leaving the converter domain initially superposes with the down/open structure, but then deviates due to the different helical bend of the α-actinin.

[0103] FIGS. 4A and 4B depict the structure of α-actinin repeats 1 and 2. α-Actinin is an example of component (2) of the recombinant protein of the present invention, which means α-actinin is the target protein in this example. Its structure was solved using purification and crystallization methods of the present invention.

[0104] FIG. 4A shows an α-carbon chain trace of the 6 helices making up repeats 1 (helices labeled 1-3) and 2 (helices labeled 4-6). The 17 hydrophobic aromatic amino acid residues stabilizing the triple-helical packing include 7 tyrosines, 6 phenylalanines and 4 tryptophans. Shown also are two adjacent proline residues, which cause a kink, but not a break, in α-helix 2 of repeat 1. The uninterrupted α-helix linking repeats 1 and 2 is shown.

[0105] FIG. 4B is a detailed view of the linker region, highlighting the stabilizing hydrophobic and hydrogen bonding interactions. Orientation is identical to that in FIG. 4A. Side chains are shown as ball-and-stick models, with the exception of Asp796 and Ser797, in which only the α-carbon atoms involved in hydrophobic contacts are shown for clarity. The salt bridge between Arg880 and Glu877, and the hydrogen bond between Arg880 and the carbonyl oxygen of Leu956 (also shown as a ball-and-stick model), are shown as dashed lines.

[0106] FIGS. 5A and 5B provide a comparison of Dictyostelium α-actinin with human α-actinin and human α-spectrin.

[0107] FIG. 5A shows the overlapping repeat 2 region of Dictyostelium and human α-actinin as ribbon diagrams. Helices are numbered as described above for Dictyostelium α-actinin and, in parentheses, as described previously for human α-actinin (Djinovic-Carugo et al., 1999). The largest differences occur in the loop region connecting helices 4 and 5, indicated by an arrow, where human α-actinin would seriously overlap with Dictyostelium helix 6.

[0108] FIG. 5B shows the alignment of Dictyostelium repeat 2 with repeat 16 of human α-spectrin as ribbon diagrams. Helices are numbered as described above for the Dictyostelium protein and, in parentheses, as described previously for the human protein (Gram et al., 1999). Dictyostelium helix 4 and α-spectrin helix A are in the background. In general, the two structures align more closely than the human/Dictyostelium alignment described in FIG. 5A. The largest difference occurs in the loop region connecting helices 5 and 6, indicated by an arrow, where the human α-spectrin structure is shifted with respect to the Dictyostelium α-actinin structure as a result of a proline-induced kink in helix B.

[0109] FIG. 6 provides the amino acid sequence (one-letter-code) for component (1) of recombinant protein M761-2R-R238E, exemplifying a recombinant protein of the invention. This sequence is further illustrated as attached SEQ ID NO. 1.

[0110] FIG. 7 provides the whole sequence of recombinant protein M761-2R-R238E comprising as component (1) the amino acid sequence of the myosin II motor domain of Dictyostelium, a three amino acid linker region (LGS) as component (3) and the α-actinin amino acid sequence being the target sequence (component (2)) in this example (one-letter-code). This sequence is further illustrated as attached SEQ ID NO. 2.

[0111] FIG. 8 is the DNA sequence coding for recombinant protein M761-2R-R238E such that the sequence of FIG. 8 corresponds to the sequence of FIG. 7 on the genetic level. This sequence is further illustrated as attached SEQ ID NO. 3.

DETAILED DESCRIPTION OF THE INVENTION

Example 1

[0112] (a) Expression

[0113] The expression vector pDXA-3H, which was used for the production of M761-2R-R238E, carries the origin of replication of the Dictyostelium high copy number plasmid Ddp2 (Leiting et al., Molecular And Cellular Biology, 10:3727-3736, 1990; Chang et al., Nucleic Acids Research, 17:3655-3661, 1990), an expression cassette consisting of the strong, constitutive actin15 promoter, a translational start codon upstream from a multiple cloning site (MCS), and sequences for the addition of a histidine octamer at the carboxy terminus of any protein. Plasmids derived from pDXA-3H were transformed into orf⁺ cells. These cells carry several integrated copies of the rep gene, which is essential in trans for the replication of plasmids that carry the Ddp2 origin (Leiting et al., Molecular And Cellular Biology, 10:3727-3736, 1990; Slade et al., Plasmid, 24:195-207, 1990). The myosin-α-actinin fusion was created by linking codon 761 of the Dictyostelium mhcA gene to codon 264 of the Dictyostelium α-actinin gene.

[0114] The resulting construct pDH12-2R extended to codon 505 of the α-actinin gene. Plasmid pDH20 was generated by insertion of the first 765 codons of Dictyostelium myosin II into the MCS of pDXA-3H (Furch et al., Biochemistry, 37:6317-6326, 1998). Site directed mutagenesis was used to generate plasmid pDH20(R238E) encoding a motor domain fragment with the single point mutation R238E. Replacement of the 2 kb SalI-BstXI fragment of pDH12-2R with the corresponding fragment from pDH20(R238E) was used to generate the expression vector for the production of M761-2R-R238E, the fusion protein and an example of a recombinant protein of the invention, thus containing both a point mutation in the active site and a C-terminal extension consisting of two α-actinin repeats.

[0115] (b) Purification

[0116] The overexpressed protein was purified by Ni²⁺-chelate affinity chromatography as described by Manstein and Hunt, J. Muscle Res. Cell Motil., 16:325, 1995 and Manstein et al., Gene, 162:129, 1995, the entire contents of each of which are incorporated herein by reference.

[0117] Cells expressing the histidine octamer tagged fusion protein were grown in 5 L flasks containing 2.5 L DD-Broth 20. DD-Broth 20 contains (per liter): 20 g proteose peptone (Oxoid), 7 g yeast extract (Oxoid), 8 g glucose, 0.33 g Na₂HPO₄.7H₂O, and 0.35 g KH₂PO₄. The flasks were incubated on a gyratory shaker at 200 rpm and 21° C. Cells were harvested at a density of 6×10⁶ ml⁻¹ by centrifugation for 7 min at 2,700 rpm in a Beckman J-6 centrifuge and washed once in PBS. The wet weight of the resulting cell pellet was determined. Typically, 35 g were obtained from a 15 L shaking culture. The cells were resuspended in 140 ml of Lysis Buffer (50 mM Tris-HCl, pH 8.0, 2 mM EDTA, 0.2 mM EGTA, 1 mM dithiothreitol (DTT), 5 mM benzamidine, 40 mg/ml TLCK, 20 mg/ml N-tosyl-L-phenylalanine chloromethyl ketone (TPCK), 200 mM phenylmethylsulfonyl fluoride (PMSF) and 0.04% NaN₃).
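
The per-liter recipe and the resuspension volume above lend themselves to a small scaling calculation. The sketch below is illustrative only: the function names are hypothetical, and scaling the lysis buffer volume linearly with wet cell weight (140 ml per 35 g, i.e., 4 ml per g) is an assumption drawn from the typical values quoted, not a stated rule.

```python
# Per-liter composition of DD-Broth 20 in grams, as given in the text.
DD_BROTH_20_G_PER_L = {
    "peptone (Oxoid)": 20.0,
    "yeast extract (Oxoid)": 7.0,
    "glucose": 8.0,
    "Na2HPO4.7H2O": 0.33,
    "KH2PO4": 0.35,
}

# Assumption: lysis buffer volume scales linearly with wet weight,
# using the quoted 140 ml for a typical 35 g pellet.
LYSIS_BUFFER_ML_PER_G = 140.0 / 35.0


def scale_recipe(recipe_g_per_l, volume_l):
    """Return grams of each component needed for the given volume."""
    return {name: grams * volume_l for name, grams in recipe_g_per_l.items()}


def lysis_buffer_volume_ml(wet_weight_g):
    """Return the lysis buffer volume for a pellet of the given wet weight."""
    return wet_weight_g * LYSIS_BUFFER_ML_PER_G


if __name__ == "__main__":
    # One 5 L flask holds 2.5 L of medium.
    for name, grams in scale_recipe(DD_BROTH_20_G_PER_L, 2.5).items():
        print(f"{name}: {grams:.3f} g")
    print(f"lysis buffer for 35 g cells: {lysis_buffer_volume_ml(35.0):.0f} ml")
```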

[0118] Cell lysis was induced by the addition of 70 ml of Lysis Buffer containing 1% Triton X-100, 15 mg/ml RNaseA (Sigma) and 100 units of alkaline phosphatase. The lysate was incubated on ice for one hour. Upon centrifugation (230,000 g, 1 hour), the recombinant protein remained in the pellet. The pellet was washed in 100 ml of HKM buffer (50 mM HEPES, pH 7.3, 30 mM KAc, 10 mM MgSO₄, 7 mM β-mercaptoethanol, 5 mM benzamidine, 40 mg/ml PMSF) and centrifuged for 45 min at 230,000 g. The recombinant protein was released into the supernatant by extraction of the resulting pellet with 60 ml HKM buffer containing 10 mM ATP. After centrifugation (500,000 g, 45 min), the supernatant was loaded using a peristaltic pump onto a Ni²⁺-nitrilotriacetic acid (Ni²⁺-NTA) affinity column (1.5×10 cm) (Qiagen). The flow rate was adjusted to approximately 3 ml min⁻¹. After loading was completed, the column was connected to a Waters 650M chromatography system. The column was washed briefly in Low Salt Buffer (50 mM HEPES, pH 7.3, 30 mM KAc, 3 mM benzamidine), High Salt Buffer (as Low Salt Buffer, but with 300 mM KAc), and Low Salt Buffer containing 50 mM imidazole. The recombinant myosin was eluted using a linear gradient of Low Salt Buffer and Imidazole Buffer (0.5 M imidazole, pH 7.3, 3 mM benzamidine), starting with 10% Imidazole Buffer and reaching 100% after 15 minutes. The flow rate was 3 ml min⁻¹ and 3 ml fractions were collected. Absorbance at 280 nm was monitored. SDS gels were run to check the purity of the eluted protein.
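
The elution gradient described above can be tabulated per fraction. This is a rough sketch under stated assumptions: Low Salt Buffer is taken to contain no imidazole (none is listed for it), mixing is treated as ideal, and dead volume between the mixer and the fraction collector is ignored. The function name and its parameters are hypothetical.

```python
def imidazole_gradient(start_fraction_b=0.10, end_fraction_b=1.00,
                       duration_min=15.0, flow_ml_per_min=3.0,
                       fraction_ml=3.0, buffer_a_mm=0.0, buffer_b_mm=500.0):
    """Return (fraction number, mid-point imidazole concentration in mM).

    Buffer A is Low Salt Buffer (assumed imidazole-free) and buffer B is
    Imidazole Buffer (0.5 M). The gradient is linear in delivered volume.
    """
    total_volume_ml = duration_min * flow_ml_per_min
    n_fractions = int(round(total_volume_ml / fraction_ml))
    table = []
    for i in range(n_fractions):
        # Evaluate the gradient at the middle of each collected fraction.
        progress = (i + 0.5) * fraction_ml / total_volume_ml
        share_b = start_fraction_b + (end_fraction_b - start_fraction_b) * progress
        conc_mm = buffer_a_mm * (1.0 - share_b) + buffer_b_mm * share_b
        table.append((i + 1, conc_mm))
    return table


if __name__ == "__main__":
    for number, conc_mm in imidazole_gradient():
        print(f"fraction {number:2d}: ~{conc_mm:.0f} mM imidazole")
```

With the quoted settings the gradient spans 45 ml, i.e., fifteen 3 ml fractions running from roughly 50 mM to 500 mM imidazole, which helps relate the A280 trace to the imidazole concentration at which the fusion protein elutes.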

[0119] The pooled fractions were dialyzed immediately against storage buffer (20 mM HEPES, 0.5 mM EDTA, 1 mM DTT, pH 7.0) containing 3% sucrose, and the purified protein could be stored at −80° C. for several months without apparent loss of enzymatic activity. Actin-activated ATPase activity was measured by the release of inorganic phosphate.

[0120] (c) Crystallization

[0121] Crystals of the overexpressed and purified recombinant protein M761-2R-R238E were grown by the hanging drop method at 7° C. The drops contained equal volumes (2.2 μl) of the protein solution and the mother liquor. The mother liquor contained 12% PEGM 5K, 170 mM NaCl, 50 mM HEPES-NaOH pH 7.2, 5 mM MgCl₂, 5 mM DTT, 0.5 mM EGTA and 2% 2-methyl-1,3-propanediol. The protein solution (5 mg/ml) additionally contained 200 μM ADP and 200 μM vanadate, and was incubated on ice for 1 h before setting up the drops. Crystals normally appeared after 7-8 days and reached maximum dimensions of 0.1×0.3×0.9 mm. Crystals were transferred to a solution of mother liquor plus 30% glycerol and frozen in liquid nitrogen for storage and data collection.

[0122] (d) Crystallography and structure refinement

[0123] Diffraction data for the crystals of the recombinant protein M761-2R-R238E were collected at ESRF beamline ID-13 on a MarCCD detector and integrated and scaled using the program XDS (Kabsch, J. Appl. Cryst., 26:795, 1993), producing a data set 97.7% complete to 2.8 Å with 4-fold redundancy and an R_sym of 11.0%. The M761-2R-R238E crystals belonged to space group P2₁2₁2 with two molecules in the asymmetric unit. Molecular replacement was performed with the program AMoRe (Navaza, Acta Cryst. A, 50:57, 1994) using the crystal structure of Dictyostelium myosin residues 2-759 complexed with Mg-ADP-BeFₓ (PDB code 1MMD) (Fisher et al., Biochemistry, 34:8960, 1995) as a starting model (the nucleotide and the side chains beyond Cβ of residues 238 and 459 were excluded).

[0124] Initial maps showed clear helical density for the first repeat of the α-actinin lever arm, which was built as a poly-alanine model using the program O (7.0 for Windows NT) (Jones et al., “Improved methods for building protein models in electron density maps and the location of errors in these models,” Acta Crystallogr. A, 47:110-119, 1991). Following several rounds of simulated annealing refinement using torsional dynamics and a maximum likelihood target with the program CNS v0.9a (Brünger et al., 1998, Acta Cryst. D, 54:905), the second α-actinin repeat was visible and built. Subsequent rounds of model building and refinement (including bulk solvent correction) produced the final structure of two M761-2R-R238E molecules containing 1005 residues each, two molecules of Mg-ADP and 14 water molecules (R-factor, 24.1%; R_free, 29.9%). Ramachandran analysis shows all nonglycine residues to be in allowed regions. Figures were made using the programs Bobscript (Esnouf, J. Mol. Graph. Model., 15:132, 1997) and Raster3D (Merritt and Bacon, Methods Enzymol., 277:505, 1997).
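
For orientation, the R-factor quoted for the refined model is conventionally defined as shown below. The notation is the standard textbook form and is assumed here rather than taken from the original text: F_obs and F_calc are the observed and model-derived structure factors.

```latex
R \;=\; \frac{\sum_{hkl} \bigl|\, |F_{\mathrm{obs}}(hkl)| - |F_{\mathrm{calc}}(hkl)| \,\bigr|}
             {\sum_{hkl} |F_{\mathrm{obs}}(hkl)|}
```

The free R-factor is computed with the same expression, but only over a test set of reflections that is excluded from refinement, so it is normally somewhat higher than the working R-factor.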

[0125] In contemplation of the principles of the present invention, reference is made to Niemann et al., “Crystal structure of a dynamin GTPase domain in both nucleotide-free and GDP-bound forms,” EMBO Journal, 20:5813-5821, 2001 and Kliche et al., “Structure of a genetically engineered molecular motor,” EMBO Journal, 20:40-46, 2001.

TABLE 1

Protein          Fusion Partner                                      Transformed  Expressed  Purified  Crystallized  Structure Solved  Thrombin Cleavage
M761-2R          Repeats 1 and 2 of D. discoideum α-actinin          Y            Y          Y         Y             Y                 X
M765-DymA2-316   D. discoideum Dynamin A residues 2-316              Y            Y          Y         Y             Y                 N
M765-DymA2-490   D. discoideum Dynamin A residues 2-490              Y            Y          N         X             X                 X
M765-SSF         D. discoideum SSF protein - unknown function        Y            Y          Y         N             X                 Y
M765-Mark1-A2    Mammalian mutant Mark1 kinase domain                Y            Y          Y         N             X                 Y
M765-Mark2       Mammalian Mark1 kinase domain                       Y            Y          X         X             X                 X
M765-Kif2c632    Mammalian kinesin related protein domain            Y            Y          X         X             X                 X
M765-Kif2-md     Mammalian kinesin related protein domain            Y            Y          X         X             X                 X
M765-kiffuII     Mammalian kinesin related complete protein          Y            X          X         X             X                 X
M765-jmjmC       D. discoideum universal transcriptional regulator   Y            X          X         X             X                 X
M765-p27         P27 protein                                         Y            N          X         X             X                 X
M765-HsCor1A     Human coronin protein                               Y            N          X         X             X                 X
M765-DdCor       D. discoideum coronin protein                       Y            Y          X         X             X                 X
M765-IRT1        Arabidopsis metal transport protein                 Y            N          X         X             X                 X
M765-DdNck2      D. discoideum Nck protein                           Y            N          X         X             X                 X

Example 2

[0126] Myosin-Fusion-System for isolating interacting proteins/protein binding partners

[0127] (a) Preparation

[0128] In order to demonstrate the function of the myosin-fusion-system, a library of cDNA was fused to the C-terminus of an MMD and expressed in Dictyostelium or another eukaryotic system. Clonal transformants were probed with the bait-protein of choice fused to β-galactosidase. The MMD was His- or epitope-tagged at the N-terminus.

[0129] Experimentally, cells were transformed with the MMD-cDNA library and clones were grown and kept in 96 well plates. The bait-β-gal fusion protein was transformed in Dictyostelium orf⁺ cells and grown in an appropriate quantity (1 clonal cell line). Upon reaching confluence, the MMD-cDNA clones in the 96 well plates were washed once in the plates with PBS and then lysed by adding lysis buffer containing Triton X®-100 (or, alternatively, NP-40). At the same time, the ATP pool was depleted by the addition of alkaline phosphatase. The actin-based cytoskeleton, together with all myosin and the M765-fusion-proteins, was pelleted by centrifugation and washed with lysis buffer. The myosin was released from the pellets by the addition of Mg²⁺-ATP. The ATP-insoluble fraction was pelleted and the supernatant transferred to 96 well plates coated with Ni-NTA. The His-tagged products of the MMD-cDNA were shown to bind to these plates. After extensive washing, the coated plates were incubated with the bait-β-gal construct. Again, after extensive washing, the plates were incubated with a substrate for β-gal, in this case CPRG (red color, OD₅₇₄) or ONPG (yellow, OD₄₁₅), and the β-gal activity was determined with a microtiter plate reader. High β-gal activity indicated a strong interaction between the bait and the product of the target cDNA.
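The final readout step, in which wells with high β-gal activity are taken to indicate an interaction, can be sketched in Python as follows. This is a minimal illustration only: the well names, OD values and the three-fold cutoff over control wells are assumptions introduced here, since no numerical threshold is specified above, and call_hits is a hypothetical helper rather than part of any plate-reader software.

```python
def call_hits(od_by_well, control_wells, fold=3.0):
    """Return wells whose OD exceeds `fold` times the mean OD of the control wells.

    The fold cutoff is an illustrative assumption, not a value from the protocol.
    """
    background = sum(od_by_well[w] for w in control_wells) / len(control_wells)
    return sorted(
        well
        for well, od in od_by_well.items()
        if well not in control_wells and od > fold * background
    )


# Made-up plate-reader values (e.g. OD at 574 nm for CPRG).
od_by_well = {"A1": 0.05, "A2": 0.06, "B1": 0.90, "B2": 0.07, "C3": 0.40}
control_wells = ["A1", "A2"]  # wells assumed to have received no bait
hits = call_hits(od_by_well, control_wells)
print(hits)
```

Clones in the wells returned by such a screen would then be recovered from the original 96 well plates for further characterization.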

[0130] The selected clones were then recovered from the original 96 well plates. The MMD-cDNA-clone was expressed in and purified from Dictyostelium by standard MMD purification. For further biochemical and structural characterization, the isolated gene product was either cleaved with an appropriate protease to release it from the MMD or was used directly in the fusion form for kinetics or crystallization experiments.

[0131] (b) Interaction Test

[0132] The method of the invention was tested by expressing MMD-Rac1A and DRG-2D-β-gal (the DRG-2D construct acts as an exchange factor for the small G-protein Rac1A).

[0133] The MMD-Rac1A cells were cloned, grown in 96 well plates, washed, lysed and ATP extracted as described above. The Ni-NTA coated plates were then incubated with the ATP-released protein fraction. The cells expressing the DRG-2D-β-gal were grown in shaking suspension and washed and lysed under the same conditions. The DRG-2D-β-gal supernatant was incubated at different dilutions. As controls, wells were incubated without bait (DRG-2D-β-gal), without MMD-Rac1A, or with MMD alone. All controls were negative after staining for β-gal, whereas the incubations with immobilized MMD-Rac1A and the bait gave a signal that was dependent on the concentration of added bait.

[0134] In conclusion, the interaction between DRG-2D and Rac1A was shown by the method of the invention, whereas it could not be shown when using the yeast-two-hybrid system. Therefore, the method of the invention has definite advantages over the yeast-two-hybrid system or other known techniques developed to identify protein-protein interactions.
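The two observations reported in the interaction test, namely negative controls and a signal that depends on the concentration of added bait, can be expressed as a simple check. The Python sketch below is illustrative only: the dilution factors, OD values and the margin over controls are assumed values, not measurements from the experiment, and is_concentration_dependent is a hypothetical helper.

```python
def is_concentration_dependent(od_by_dilution, control_ods, margin=0.05):
    """Check that signal falls as the bait is diluted and exceeds all controls.

    od_by_dilution maps a dilution factor (1 = undiluted, 2 = 1:2, ...) to an OD.
    The margin over the highest control is an illustrative assumption.
    """
    ordered = [od for _, od in sorted(od_by_dilution.items())]  # least dilute first
    strictly_decreasing = all(a > b for a, b in zip(ordered, ordered[1:]))
    above_controls = ordered[0] > max(control_ods) + margin
    return strictly_decreasing and above_controls


# Made-up readings: bait supernatant at four dilutions, plus three control wells
# (no bait, no MMD fusion, MMD alone).
bait_series = {1: 1.20, 2: 0.75, 4: 0.40, 8: 0.22}
controls = [0.05, 0.06, 0.04]
dependent = is_concentration_dependent(bait_series, controls)
print(dependent)
```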

[0135] This invention has been described in terms of specific embodiments, set forth in detail. It should be understood, however, that these embodiments are presented by way of illustration only, and that the invention is not necessarily limited thereto.

1 3 1 765 PRT Artificial Sequence Description of Artificial SequencePartial myosin sequence of Dictyostelium; Component (1) of therecombinant protein M761-2R R238E 1 Met Asp Gly Thr Glu Asp Pro Ile HisAsp Arg Thr Ser Asp Tyr His 1 5 10 15 Lys Tyr Leu Lys Val Lys Gln GlyAsp Ser Asp Leu Phe Lys Leu Thr 20 25 30 Val Ser Asp Lys Arg Tyr Ile TrpTyr Asn Pro Asp Pro Lys Glu Arg 35 40 45 Asp Ser Tyr Glu Cys Gly Glu IleVal Ser Glu Thr Ser Asp Ser Phe 50 55 60 Thr Phe Lys Thr Val Asp Gly GlnAsp Arg Gln Val Lys Lys Asp Asp 65 70 75 80 Ala Asn Gln Arg Asn Pro IleLys Phe Asp Gly Val Glu Asp Met Ser 85 90 95 Glu Leu Ser Tyr Leu Asn GluPro Ala Val Phe His Asn Leu Arg Val 100 105 110 Arg Tyr Asn Gln Asp LeuIle Tyr Thr Tyr Ser Gly Leu Phe Leu Val 115 120 125 Ala Val Asn Pro PheLys Arg Ile Pro Ile Tyr Thr Gln Glu Met Val 130 135 140 Asp Ile Phe LysGly Arg Arg Arg Asn Glu Val Ala Pro His Ile Phe 145 150 155 160 Ala IleSer Asp Val Ala Tyr Arg Ser Met Leu Asp Asp Arg Gln Asn 165 170 175 GlnSer Leu Leu Ile Thr Gly Glu Ser Gly Ala Gly Lys Thr Glu Asn 180 185 190Thr Lys Lys Val Ile Gln Tyr Leu Ala Ser Val Ala Gly Arg Asn Gln 195 200205 Ala Asn Gly Ser Gly Val Leu Glu Gln Gln Ile Leu Gln Ala Asn Pro 210215 220 Ile Leu Glu Ala Phe Gly Asn Ala Lys Thr Thr Arg Asn Asn Asn Ser225 230 235 240 Ser Arg Phe Gly Lys Phe Ile Glu Ile Gln Phe Asn Ser AlaGly Phe 245 250 255 Ile Ser Gly Ala Ser Ile Gln Ser Tyr Leu Leu Glu LysSer Arg Val 260 265 270 Val Phe Gln Ser Glu Thr Glu Arg Asn Tyr His IlePhe Tyr Gln Leu 275 280 285 Leu Ala Gly Ala Thr Ala Glu Glu Lys Lys AlaLeu His Leu Ala Gly 290 295 300 Pro Glu Ser Phe Asn Tyr Leu Asn Gln SerGly Cys Val Asp Ile Lys 305 310 315 320 Gly Val Ser Asp Ser Glu Glu PheLys Ile Thr Arg Gln Ala Met Asp 325 330 335 Ile Val Gly Phe Ser Gln GluGlu Gln Met Ser Ile Phe Lys Ile Ile 340 345 350 Ala Gly Ile Leu His LeuGly Asn Ile Lys Phe Glu Lys Gly Ala Gly 355 360 365 Glu Gly Ala Val LeuLys Asp Lys Thr Ala Leu Asn Ala Ala Ser Thr 370 375 380 Val Phe Gly ValAsn Pro Ser Val Leu 
Glu Lys Ala Leu Met Glu Pro 385 390 395 400 Arg IleLeu Ala Gly Arg Asp Leu Val Ala Gln His Leu Asn Val Glu 405 410 415 LysSer Ser Ser Ser Arg Asp Ala Leu Val Lys Ala Leu Tyr Gly Arg 420 425 430Leu Phe Leu Trp Leu Val Lys Lys Ile Asn Asn Val Leu Cys Gln Glu 435 440445 Arg Lys Ala Tyr Phe Ile Gly Val Leu Asp Ile Ser Gly Phe Glu Ile 450455 460 Phe Lys Val Asn Ser Phe Glu Gln Leu Cys Ile Asn Tyr Thr Asn Glu465 470 475 480 Lys Leu Gln Gln Phe Phe Asn His His Met Phe Lys Leu GluGln Glu 485 490 495 Glu Tyr Leu Lys Glu Lys Ile Asn Trp Thr Phe Ile AspPhe Gly Leu 500 505 510 Asp Ser Gln Ala Thr Ile Asp Leu Ile Asp Gly ArgGln Pro Pro Gly 515 520 525 Ile Leu Ala Leu Leu Asp Glu Gln Ser Val PhePro Asn Ala Thr Asp 530 535 540 Asn Thr Leu Ile Thr Lys Leu His Ser HisPhe Ser Lys Lys Asn Ala 545 550 555 560 Lys Tyr Glu Glu Pro Arg Phe SerLys Thr Glu Phe Gly Val Thr His 565 570 575 Tyr Ala Gly Gln Val Met TyrGlu Ile Gln Asp Trp Leu Glu Lys Asn 580 585 590 Lys Asp Pro Leu Gln GlnAsp Leu Glu Leu Cys Phe Lys Asp Ser Ser 595 600 605 Asp Asn Val Val ThrLys Leu Phe Asn Asp Pro Asn Ile Ala Ser Arg 610 615 620 Ala Lys Lys GlyAla Asn Phe Ile Thr Val Ala Ala Gln Tyr Lys Glu 625 630 635 640 Gln LeuAla Ser Leu Met Ala Thr Leu Glu Thr Thr Asn Pro His Phe 645 650 655 ValArg Cys Ile Ile Pro Asn Asn Lys Gln Leu Pro Ala Lys Leu Glu 660 665 670Asp Lys Val Val Leu Asp Gln Leu Arg Cys Asn Gly Val Leu Glu Gly 675 680685 Ile Arg Ile Thr Arg Lys Gly Phe Pro Asn Arg Ile Ile Tyr Ala Asp 690695 700 Phe Val Lys Arg Tyr Tyr Leu Leu Ala Pro Asn Val Pro Arg Asp Ala705 710 715 720 Glu Asp Ser Gln Lys Ala Thr Asp Ala Val Leu Lys His LeuAsn Ile 725 730 735 Asp Pro Glu Gln Tyr Arg Phe Gly Ile Thr Lys Ile PhePhe Arg Ala 740 745 750 Gly Gln Leu Ala Arg Ile Glu Glu Ala Arg Glu GlnArg 755 760 765 2 1016 PRT Artificial Sequence Description of ArtificialSequence Whole sequence of recombinant protein M761-2R R238 E 2 Met AspGly Thr Glu Asp Pro Ile His Asp Arg Thr Ser Asp Tyr His 1 5 10 15 LysTyr Leu Lys Val Lys Gln Gly Asp 
Ser Asp Leu Phe Lys Leu Thr 20 25 30 ValSer Asp Lys Arg Tyr Ile Trp Tyr Asn Pro Asp Pro Lys Glu Arg 35 40 45 AspSer Tyr Glu Cys Gly Glu Ile Val Ser Glu Thr Ser Asp Ser Phe 50 55 60 ThrPhe Lys Thr Val Asp Gly Gln Asp Arg Gln Val Lys Lys Asp Asp 65 70 75 80Ala Asn Gln Arg Asn Pro Ile Lys Phe Asp Gly Val Glu Asp Met Ser 85 90 95Glu Leu Ser Tyr Leu Asn Glu Pro Ala Val Phe His Asn Leu Arg Val 100 105110 Arg Tyr Asn Gln Asp Leu Ile Tyr Thr Tyr Ser Gly Leu Phe Leu Val 115120 125 Ala Val Asn Pro Phe Lys Arg Ile Pro Ile Tyr Thr Gln Glu Met Val130 135 140 Asp Ile Phe Lys Gly Arg Arg Arg Asn Glu Val Ala Pro His IlePhe 145 150 155 160 Ala Ile Ser Asp Val Ala Tyr Arg Ser Met Leu Asp AspArg Gln Asn 165 170 175 Gln Ser Leu Leu Ile Thr Gly Glu Ser Gly Ala GlyLys Thr Glu Asn 180 185 190 Thr Lys Lys Val Ile Gln Tyr Leu Ala Ser ValAla Gly Arg Asn Gln 195 200 205 Ala Asn Gly Ser Gly Val Leu Glu Gln GlnIle Leu Gln Ala Asn Pro 210 215 220 Ile Leu Glu Ala Phe Gly Asn Ala LysThr Thr Arg Asn Asn Asn Ser 225 230 235 240 Ser Arg Phe Gly Lys Phe IleGlu Ile Gln Phe Asn Ser Ala Gly Phe 245 250 255 Ile Ser Gly Ala Ser IleGln Ser Tyr Leu Leu Glu Lys Ser Arg Val 260 265 270 Val Phe Gln Ser GluThr Glu Arg Asn Tyr His Ile Phe Tyr Gln Leu 275 280 285 Leu Ala Gly AlaThr Ala Glu Glu Lys Lys Ala Leu His Leu Ala Gly 290 295 300 Pro Glu SerPhe Asn Tyr Leu Asn Gln Ser Gly Cys Val Asp Ile Lys 305 310 315 320 GlyVal Ser Asp Ser Glu Glu Phe Lys Ile Thr Arg Gln Ala Met Asp 325 330 335Ile Val Gly Phe Ser Gln Glu Glu Gln Met Ser Ile Phe Lys Ile Ile 340 345350 Ala Gly Ile Leu His Leu Gly Asn Ile Lys Phe Glu Lys Gly Ala Gly 355360 365 Glu Gly Ala Val Leu Lys Asp Lys Thr Ala Leu Asn Ala Ala Ser Thr370 375 380 Val Phe Gly Val Asn Pro Ser Val Leu Glu Lys Ala Leu Met GluPro 385 390 395 400 Arg Ile Leu Ala Gly Arg Asp Leu Val Ala Gln His LeuAsn Val Glu 405 410 415 Lys Ser Ser Ser Ser Arg Asp Ala Leu Val Lys AlaLeu Tyr Gly Arg 420 425 430 Leu Phe Leu Trp Leu Val Lys Lys Ile Asn AsnVal Leu Cys Gln Glu 435 440 445 Arg Lys 
Ala Tyr Phe Ile Gly Val Leu AspIle Ser Gly Phe Glu Ile 450 455 460 Phe Lys Val Asn Ser Phe Glu Gln LeuCys Ile Asn Tyr Thr Asn Glu 465 470 475 480 Lys Leu Gln Gln Phe Phe AsnHis His Met Phe Lys Leu Glu Gln Glu 485 490 495 Glu Tyr Leu Lys Glu LysIle Asn Trp Thr Phe Ile Asp Phe Gly Leu 500 505 510 Asp Ser Gln Ala ThrIle Asp Leu Ile Asp Gly Arg Gln Pro Pro Gly 515 520 525 Ile Leu Ala LeuLeu Asp Glu Gln Ser Val Phe Pro Asn Ala Thr Asp 530 535 540 Asn Thr LeuIle Thr Lys Leu His Ser His Phe Ser Lys Lys Asn Ala 545 550 555 560 LysTyr Glu Glu Pro Arg Phe Ser Lys Thr Glu Phe Gly Val Thr His 565 570 575Tyr Ala Gly Gln Val Met Tyr Glu Ile Gln Asp Trp Leu Glu Lys Asn 580 585590 Lys Asp Pro Leu Gln Gln Asp Leu Glu Leu Cys Phe Lys Asp Ser Ser 595600 605 Asp Asn Val Val Thr Lys Leu Phe Asn Asp Pro Asn Ile Ala Ser Arg610 615 620 Ala Lys Lys Gly Ala Asn Phe Ile Thr Val Ala Ala Gln Tyr LysGlu 625 630 635 640 Gln Leu Ala Ser Leu Met Ala Thr Leu Glu Thr Thr AsnPro His Phe 645 650 655 Val Arg Cys Ile Ile Pro Asn Asn Lys Gln Leu ProAla Lys Leu Glu 660 665 670 Asp Lys Val Val Leu Asp Gln Leu Arg Cys AsnGly Val Leu Glu Gly 675 680 685 Ile Arg Ile Thr Arg Lys Gly Phe Pro AsnArg Ile Ile Tyr Ala Asp 690 695 700 Phe Val Lys Arg Tyr Tyr Leu Leu AlaPro Asn Val Pro Arg Asp Ala 705 710 715 720 Glu Asp Ser Gln Lys Ala ThrAsp Ala Val Leu Lys His Leu Asn Ile 725 730 735 Asp Pro Glu Gln Tyr ArgPhe Gly Ile Thr Lys Ile Phe Phe Arg Ala 740 745 750 Gly Gln Leu Ala ArgIle Glu Glu Ala Arg Glu Gln Arg Leu Gly Ser 755 760 765 Glu Gln Thr LysSer Asp Tyr Leu Lys Arg Ala Asn Glu Leu Val Gln 770 775 780 Trp Ile AsnAsp Lys Gln Ala Ser Leu Glu Ser Arg Asp Phe Gly Asp 785 790 795 800 SerIle Glu Ser Val Gln Ser Phe Met Asn Ala His Lys Glu Tyr Lys 805 810 815Lys Thr Glu Lys Pro Pro Lys Gly Gln Glu Val Ser Glu Leu Glu Ala 820 825830 Ile Tyr Asn Ser Leu Gln Thr Lys Leu Arg Leu Ile Lys Arg Glu Pro 835840 845 Phe Val Ala Pro Ala Gly Leu Thr Pro Asn Glu Ile Asp Ser Thr Trp850 855 860 Ser Ala Leu Glu Lys Ala Glu Gln Glu His 
Ala Glu Ala Leu ArgIle 865 870 875 880 Glu Leu Lys Arg Gln Lys Lys Ile Ala Val Leu Leu GlnLys Tyr Asn 885 890 895 Arg Ile Leu Lys Lys Leu Glu Asn Trp Ala Thr ThrLys Ser Val Tyr 900 905 910 Leu Gly Ser Asn Glu Thr Gly Asp Ser Ile ThrAla Val Gln Ala Lys 915 920 925 Leu Lys Asn Leu Glu Ala Phe Asp Gly GluCys Gln Ser Leu Glu Gly 930 935 940 Gln Ser Asn Ser Asp Leu Leu Ser IleLeu Ala Gln Leu Thr Glu Leu 945 950 955 960 Asn Tyr Asn Gly Val Pro GluLeu Thr Glu Arg Lys Asp Thr Phe Phe 965 970 975 Ala Gln Gln Trp Thr GlyVal Lys Ser Ser Ala Glu Thr Tyr Lys Asn 980 985 990 Thr Leu Leu Ala GluLeu Glu Arg Leu Gln Lys Ile Glu Asp Ala Leu 995 1000 1005 His His HisHis His His His His 1010 1015 3 3048 DNA Artificial Sequence Descriptionof Artificial Sequence DNA sequence coding for recombinant proteinM761-2R R238E 3 atggatggta ccgaggatcc aattcatgat agaacttcag attatcacaaatacttaaaa 60 gttaaacaag gtgattctga tttatttaaa cttactgttt cagataagagatacatttgg 120 tataatccag atccaaaaga aagagattca tatgaatgtg gtgaaattgtttcagaaacc 180 tctgattctt tcacattcaa aaccgttgat ggtcaagaca gacaagtcaaaaaggatgat 240 gccaatcaac gtaatccaat caaattcgat ggtgtcgaag atatgtctgaattatcatac 300 ctcaatgaac cagcagtttt ccacaatctc cgtgttcgtt acaatcaagatttaatttac 360 acctattcag gtctcttttt ggttgccgtc aatccattca agagaattccaatctacact 420 caagagatgg ttgatatctt caaaggtcgt agaagaaatg aagttgccccacatattttc 480 gccatttctg atgttgccta tcgttcaatg ttagatgatc gtcaaaatcaatcactctta 540 atcactggtg aatctggtgc tggtaagact gaaaacacca aaaaggtcattcaatatctt 600 gcatctgtcg ctggtcgtaa tcaagccaat ggtagtggtg tattggaacaacaaattctc 660 caagccaatc caatccttga agcttttggt aatgccaaaa ccacccgtaacaacaattca 720 tctcgtttcg gtaaattcat tgaaattcaa ttcaacagtg ctggtttcattagtggtgct 780 tcaattcaat cctacctttt agagaaatca cgtgtcgttt tccaatctgaaaccgaacgt 840 aattatcaca ttttctatca actcttagct ggtgccaccg ccgaagaaaagaaagctctt 900 cacttggctg gtccagaatc attcaactac ttaaatcaaa gtggttgtgttgatatcaaa 960 ggtgtctctg atagtgaaga attcaaaatc actcgtcaag ctatggacattgttggtttc 1020 tcacaagaag 
aacaaatgtc aatctttaag atcattgctg gtatcttacatttaggtaac 1080 atcaaattcg aaaaaggtgc tggtgaaggt gctgtcctca aagacaaaaccgccctcaac 1140 gctgcttcaa ccgtctttgg tgtcaatcca tcagtccttg aaaaggctctcatggaacca 1200 cgtattttag ccggtcgtga tttagttgct caacatctca acgttgaaaaatcctcatca 1260 tcaagagacg ctcttgtcaa agctctctat ggtcgtcttt tcctctggttggtcaaaaag 1320 atcaacaatg tcctctgtca agagagaaaa gcttacttta ttggtgttttggatatttca 1380 ggttttgaaa ttttcaaagt caattcattc gaacaattat gtatcaattataccaatgaa 1440 aaactccaac aattcttcaa tcaccatatg ttcaaattgg aacaagaagaatatcttaaa 1500 gagaaaatca attggacttt catcgatttt ggtcttgatt cacaagccactatcgattta 1560 attgatggtc gtcaaccacc aggtatttta gctcttttgg atgaacaatctgttttccca 1620 aatgccaccg ataatacttt aatcaccaaa ctccacagtc actttagcaagaagaacgcc 1680 aaatacgaag aaccacgttt ctccaaaacc gaatttggtg ttacccattatgctggtcaa 1740 gtcatgtatg agattcaaga ttggttagaa aagaacaaag atccattacaacaagatctc 1800 gaactttgct tcaaagattc atcagacaac gttgtcacca aacttttcaatgatccaaac 1860 attgccagtc gtgcaaagaa aggtgcaaac tttatcactg tcgccgctcaatacaaggaa 1920 caattagcct cactcatggc nacccttgaa accaccaacc cacatttcgttcgttgtatc 1980 attccaaaca acaaacaatt accagccaaa ctcgaagata aagttgtcctcgaccaatta 2040 cgttgcaatg gtgtcctcga aggtattcgt attactcgta aaggtttcccaaatcgtatt 2100 atctatgccg atttcgtcaa acgttactat ttattagctc caaacgttccaagagacgct 2160 gaagactcac aaaaagccac cgatgctgtt ctcaaacatc ttaacattgatccagaacaa 2220 tatcgtttcg gtatcaccaa gattttcttc cgtgccggtc aattagctcgtattgaagaa 2280 gctcgtgaac aacgtctagg atccgaacaa accaaatctg attatcttaaaagagccaat 2340 gaactcgttc aatggattaa cgataaacaa gcatcacttg aatcacgtgattttggtgat 2400 tccatcgaat ctgttcaaag tttcatgaac gctcataaag aatataaaaaaaccgaaaaa 2460 ccaccaaagg gtcaagaagt ctctgaattg gaagctatct acaattcattacaaactaaa 2520 ttacgtttaa ttaaacgtga accatttgtt gcaccagctg gtctcactccaaatgaaatc 2580 gattccactt ggtccgcttt agagaaagct gaacaagaac atgctgaagccctccgtatt 2640 gaactcaaac gtcaaaagaa aattgcagtt ctcttacaaa aatacaatcgtattctcaag 2700 aaactcgaaa actgggccac caccaaatct gtctacctcg 
gttccaatgaaaccggtgac 2760 agtatcactg ctgttcaagc taaattaaag aatttagaag cttttgatggtgaatgtcaa 2820 tcattggaag gtcaatcaaa ctctgatctc ctcagcattc ttgctcaattaactgaactc 2880 aactacaatg gtgtaccaga actcactgaa cgtaaagata cattctttgctcaacaatgg 2940 actggtgtta aatcatctgc tgaaacctac aaaaacactc ttttagctgaacttgaaaga 3000 ctccaaaaga ttgaagatgc attacatcat catcatcatc atcaccac3048
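Because the listing above interleaves three-letter residue codes with position markers, and some codes are fused by line wrapping, a small parser is a convenient way to read it. The Python sketch below is a hedged illustration: it parses one short fragment copied from SEQ ID NO. 2, takes the position of the first residue in that fragment (753) from the markers in the listing, and checks that positions 766-768 read Leu-Gly-Ser. The helper parse_residues is hypothetical and is intended only for the residue portion of the listing, not for its descriptive header text.

```python
import re

THREE_LETTER = (
    "Ala Arg Asn Asp Cys Gln Glu Gly His Ile "
    "Leu Lys Met Phe Pro Ser Thr Trp Tyr Val"
).split()
RESIDUE = re.compile("|".join(THREE_LETTER))


def parse_residues(listing_fragment):
    """Extract three-letter residue codes, ignoring position numbers and
    splitting codes that were fused together (e.g. 'ArgIle')."""
    return RESIDUE.findall(listing_fragment)


# Fragment of SEQ ID NO. 2 as it appears in the listing (residues 753-768).
fragment = "Gly Gln Leu Ala ArgIle Glu Glu Ala Arg Glu Gln Arg Leu Gly Ser 755 760 765"
first_position = 753  # read from the position markers in the listing

residues = parse_residues(fragment)
offset = 766 - first_position
linker = residues[offset:offset + 3]
print(linker)
```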

We claim:
 1. A recombinant protein comprising: (a) a first protein, or an analog, fragment or derivative thereof; and (b) a target protein of interest.
 2. The recombinant protein of claim 1, further comprising: (c) a linker between (a) and (b).
 3. The recombinant protein of claim 2, wherein (c) comprises at least 2 amino acids.
 4. The recombinant protein of claim 1, wherein (b) is any amino acid sequence of at least 20 amino acids.
 5. The recombinant protein of claim 4, wherein (a) comprises an amino acid sequence of a member of the myosin or kinesin protein superfamilies, or an analog, fragment or derivative thereof.
 6. The recombinant protein of claim 5, wherein (a) is chosen from the group consisting of amino acid sequences of a member of the myosin I, II, III, IV, V, VI, VIII, X or XI families, or the kinesin I or II families, or an analog, fragment or derivative thereof.
 7. The recombinant protein of claim 5, wherein (a) is an amino acid sequence for the motor domain of a member of the myosin or kinesin protein superfamilies, or an analog, fragment or derivative thereof.
 8. The recombinant protein of claim 3, wherein (c) comprises a sequence of 3 amino acids, wherein Gly is in the second position.
 9. The recombinant protein of claim 4, wherein (b) comprises the amino acid sequence of an esterase, hydrolase, phosphatase, kinase, protease, channel, structural protein, receptor, transcription factor, DNA/RNA-binding protein, lipoprotein or glycoprotein, or an analog, derivative or fragment thereof.
 10. The recombinant protein of claim 9, wherein (b) is selected from the group consisting of the structural proteins coronin or spectrin, and a neuronal or immunologically relevant receptor.
 11. The recombinant protein of claim 1, wherein (a) comprises an amino acid sequence of SEQ ID NO. 1.
 12. The recombinant protein of claim 11, wherein (c) comprises the amino acid sequence Leu-Gly-Ser.
 13. The recombinant protein of claim 1, further comprising a tag sequence at the N- or C-terminus of the protein.
 14. A DNA sequence comprising an amino acid sequence that codes for the recombinant protein of claim 1.
 15. An expression vector comprising the DNA sequence of claim 14.
 16. An expression vector of claim 15, capable of expression in an eukaryotic host cell.
 17. The expression vector of claim 16, capable of expression in cells of Dictyostelium.
 18. A transformed eukaryotic host cell comprising a vector of claim 16.
 19. A transformed eukaryotic host cell comprising a vector of claim 17.
 20. A method for producing a recombinant protein according to claim 1, the method comprising the steps of: (a) preparing an expression vector comprising a DNA sequence that codes for the recombinant protein of claim 1; (b) transforming eukaryotic host cells with a vector obtainable from step (a); and (c) growing transformed host cells obtainable from step (b) under conditions suitable for the expression of said recombinant protein.
 21. A method for purifying a recombinant protein according to claim 1, the method comprising the steps of: (a) preparing an expression vector comprising a DNA sequence that codes for the recombinant protein of claim 1; (b) transforming eukaryotic host cells with a vector obtainable from step (a); (c) growing transformed host cells obtainable from step (b) under conditions suitable for the overexpression of said recombinant protein; (d) purifying overexpressed recombinant protein by binding to endogenous actin or microtubules of the eukaryotic host cell; and (e) specifically releasing bound recombinant protein from the actin or microtubules.
 22. The method of claim 21, wherein (e) comprises releasing the recombinant protein by adding a natural substrate of component (a) of claim 1.
 23. The method according to claim 22, wherein the natural substrate is ATP.
 24. The method of claim 21, further comprising at least one additional purifying step, chosen from biochemical, chromatographic and physical methods, or combinations thereof.
 25. The method of claim 21, wherein the additional purification step comprises affinity chromatography.
 26. The method of claim 25, wherein the affinity chromatography utilizes metals or antibodies as ligands.
 27. A method for crystallizing a recombinant protein, the method comprising the steps of: (a) purifying the recombinant protein according to the method of claim 21; and (b) crystallizing the purified recombinant protein obtained in step (a).
 28. A protein crystal having a crystal lattice formed by a network of recombinant proteins of claim 1.
 29. A method for elucidating the atomic structure of a protein crystal, the method comprising the steps of: (a) crystallizing a recombinant protein according to the method of claim 27; (b) collecting X-ray diffraction data for the protein crystal obtained in step (a); and (c) calculating the atomic structure of the recombinant protein by transformation of the data obtained in step (b).